GRAEL: an agent-based evolutionary computing approach for natural language grammar development
Abstract
This paper describes an agent-based evolutionary computing technique called GRAEL (Grammar Evolution) that can perform a range of natural language grammar optimization and induction tasks. Two instantiations of the GRAEL environment are described: in GRAEL-1, large annotated corpora are used to bootstrap grammatical structure in a society of agents, who engage in a series of communicative attempts during which they redistribute grammatical information to reflect useful statistics for the task of parsing. In GRAEL-2, agents are allowed to mutate grammatical information, effectively implementing grammar rule discovery in a practical context. A combination of GRAEL-1 and GRAEL-2 can be shown to provide an interesting all-round optimization for corpus-induced grammars.
[...] origins of grammar in a computational context [Batali, 2002; Kirby, 2001] or the co-ordinated co-evolution of grammatical principles [Briscoe, 1998]. Yet so far, little or no progress has been achieved in evaluating evolutionary computing as a tool for the induction or optimization of data-driven parsing techniques. The GRAEL environment provides a suitable framework for the induction and optimization of any type of grammar for natural language in an evolutionary setting.
In this paper we provide an overview of GRAEL as a grammar optimization and induction technique. We will first outline the basic architecture of the GRAEL environment in Section 2 on the basis of a toy example. Next, we introduce GRAEL-1 (Section 3) as a grammar optimization technique that can enhance corpus-induced grammars. By adding an element of mutation in GRAEL-2, we implement a method to extend the coverage of a corpus-induced grammar. We will also describe a combination of GRAEL-1 and GRAEL-2, which can be shown to provide an interesting all-round optimization technique for corpus-induced grammars.
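The abstract gives no implementation detail, so the following is only a toy sketch of the general idea: agents hold rule counts, exchange rules in pairwise communicative attempts (GRAEL-1 style redistribution), and may optionally mutate a rule before passing it on (GRAEL-2 style discovery). The class `Agent`, the functions `language_game`, `mutate`, and `run`, the rule representation, and the mutation operator are all assumptions made for illustration, not the paper's actual design.

```python
import random
from collections import Counter

# A rule is a (lhs, rhs) pair such as ("S", ("NP", "VP")).
# Every name in this sketch is hypothetical and not taken from the paper.


class Agent:
    def __init__(self, rules):
        # Map each rule to a frequency count, as if induced from a corpus.
        self.grammar = Counter(rules)

    def learn(self, rule):
        # Reinforce a rule heard from another agent (adds it if unseen).
        self.grammar[rule] += 1


def mutate(rule, categories, rng):
    """Assumed mutation operator: swap one right-hand-side symbol."""
    lhs, rhs = rule
    i = rng.randrange(len(rhs))
    new_rhs = rhs[:i] + (rng.choice(categories),) + rhs[i + 1:]
    return (lhs, new_rhs)


def language_game(speaker, hearer, rng, mutation_rate=0.0, categories=()):
    """One communicative attempt between two agents.

    The speaker picks a rule with probability proportional to its count,
    optionally mutates it, and the hearer updates its own counts.
    """
    rule = rng.choice(list(speaker.grammar.elements()))
    if categories and rng.random() < mutation_rate:
        rule = mutate(rule, list(categories), rng)
    hearer.learn(rule)
    return rule


def run(agents, rounds, rng, **kwargs):
    # Repeatedly pair two distinct agents at random and let them interact.
    for _ in range(rounds):
        speaker, hearer = rng.sample(agents, 2)
        language_game(speaker, hearer, rng, **kwargs)
```

With `mutation_rate=0.0` the society only redistributes rules it already has. A positive rate together with a category inventory lets new rules enter the population, which in a real system would then have to be evaluated against parsing performance.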
Similar resources
Evolutionary Computing as a Tool for Grammar Development
In this paper, an agent-based evolutionary computing technique is introduced that is geared towards the automatic induction and optimization of grammars for natural language (GRAEL). We outline three instantiations of the GRAEL environment: the GRAEL-1 system uses large annotated corpora to bootstrap grammatical structure in a society of autonomous agents, that tries to optimally redistribute g...
Agent-Based Unsupervised Grammar Induction
In this paper, we describe an agent-based evolutionary computing approach to unsupervised grammar induction called GRAEL (Grammar Evolution). Extending a general framework for data-driven grammar optimization and induction, the evolutionary setup of GRAEL can be used to automatically induce and optimize grammars from scratch on the basis of unstructured text. Agents are equipped with a very bas...
A MODEL FOR EVOLUTIONARY DYNAMICS OF WORDS IN A LANGUAGE
Human language, over its evolutionary history, has emerged as one of the fundamental defining characteristics of modern man. However, this milestone evolutionary process through natural selection has not left any 'linguistic fossils' that may enable us to trace back the actual course of development of language and its establishment in human societies. Lacking analytical tools to fathom the cr...
Learning Classifier System Approach to Natural Language Grammar Induction
This paper describes an evolutionary approach to the problem of inferring a non-stochastic context-free grammar (CFG) from natural language (NL) corpora. The approach employs a Grammar-based Classifier System (GCS). GCS is a new version of Learning Classifier Systems in which classifiers are represented by a CFG in Chomsky Normal Form. GCS has been tested on the NL corpora, and it provided comparable...
A Novel Approach to Conditional Random Field-based Named Entity Recognition using Persian Specific Features
Named Entity Recognition is an information extraction technique that identifies named entities in a text. Three methods have conventionally been used to extract named entities from a text: rule-based, machine-learning-based, and hybrids of the two. Machine-learning-based methods perform well on the Persian language if they are trained with good features. To get good performanc...